“There’d be no music without the words…” — Bob Dylan (1965)
Enabling the Study of Lyric-Music Correspondence in Song:
Encoding Linguistic Annotations in Humdrum Scores
1 School of Music, Georgia Institute of Technology
Lyrics and vocals are a prominent and highly valued element in the world’s most popular music genres. Unfortunately, the Western tradition of intellectual music theory and practice has often emphasized instrumental music.
Counts of keywords referenced in ICCCM abstract titles and bodies.
| Year | Total N | “lyric” | “word” | “text” | “syllable” | “syntax” (language) | “harmony” | “chord” | “tonality” | “note” | “pitch” | “rhythm” | “meter” | “syntax” (music) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023 | 7 | 0 ; 0 | 0 ; 0 | 0 ; 0 | 0 ; 0 | 0 ; 0 | 1 ; 3 | 0 ; 0 | 0 ; 1 | 0 ; 2 | 0 ; 2 | 0 ; 1 | 0 ; 1 | 0 ; 1 |
| 2024 | 30 | 0 ; 3 | 0 ; 2 | 0 ; 1 | 0 ; 2 | 0 ; 0 | 1 ; 4 | 1 ; 2 | 1 ; 3 | 0 ; 6 | 0 ; 4 | 2 ; 4 | 1 ; 2 | 0 ; 0 |
| 2025* | 47 | 0 ; 2 | 0 ; 1 | 0 ; 2 | 1 ; 2 | 0 ; 0 | 4 ; 11 | 1 ; 10 | 4 ; 12 | 1 ; 20 | 5 ; 18 | 3 ; 10 | 1 ; 3 | 0 ; 2 |
* = excluding my abstract. Formatting is “N in title ; N in body”.
Lyrics are made up of “words,” which are units of (semantic) meaning.
Written language/lyrics is focused on words, with prosodic and phonetic information underspecified or ignored.
Orthography \(\neq\) Pronunciation \(\neq\) Prosody
Western scores have adopted orthographic conventions for representing lyrics and their relationship to musical events. However, these conventions are not consistent across time, publishers, or languages. For example, punctuation is often used ad hoc to represent musical, prosodic, or syntactic units.
Punctuation \(\neq\) Syntax \(\neq\) Prosody
The most reliable prosodic information in lyric data is the syllable structure of multi-syllable words.
Consider two different encodings of a famous lyric. Can you spot the differences?
1a. I see a bad moon a ri-sin’
2a. I see trou-ble on the way
1b. I see, the Bad Moon a-ris-ing
2b. I see, troub-le on the way.
Lyric data often features these sorts of inconsistencies from piece to piece and from line to line.
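One way to make such differences explicit is to normalize each encoding at more than one level and compare the results. The following is a minimal Python sketch and is not taken from sungdrum: the function names (`words`, `syllables`, `differences`) and the normalization rules (lowercasing, dropping commas and periods, splitting at hyphens) are assumptions chosen for illustration.

```python
import re

# The two encodings of the first line, copied from the example above.
ENCODING_A = "I see a bad moon a ri-sin’"
ENCODING_B = "I see, the Bad Moon a-ris-ing"

def strip_punctuation(token: str) -> str:
    """Remove sentence punctuation but keep hyphens and apostrophes."""
    return re.sub(r"[,.;:!?]", "", token)

def words(line: str) -> list[str]:
    """Lowercased word tokens with punctuation and syllable hyphens removed."""
    return [strip_punctuation(t).replace("-", "").lower() for t in line.split()]

def syllables(line: str) -> list[str]:
    """Lowercased syllables, splitting each token at its hyphens."""
    out = []
    for token in line.split():
        out.extend(s for s in strip_punctuation(token).lower().split("-") if s)
    return out

def differences(a: list[str], b: list[str]) -> list[tuple[int, str, str]]:
    """Index and values wherever two aligned sequences disagree."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

if __name__ == "__main__":
    print(len(words(ENCODING_A)), len(words(ENCODING_B)))
    print(differences(syllables(ENCODING_A), syllables(ENCODING_B)))
```

Under these rules the two encodings contain the same number of syllables (8) but different numbers of word tokens (7 and 6), because one treats "a" as a separate word and the other as the first syllable of "a-ris-ing". Aligning at the syllable level then isolates the remaining disagreements: the article ("a" versus "the") and the syllabification and spelling of the final word.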
In the following analyses, I consider the shared information between “musical” and “lyrical” features in three datasets of sung music. This illustrates the degree to which lyric data can shed light on other aspects of the music.
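The features, with their possible or example values, are: contour (up, down, same); melodic interval (e.g., ±M2); scale degree (e.g., #4); duration (16, 8, 4, 2); precedes rest (true or false); position in word (single syllable, first syllable, last syllable, middle syllable); melisma (new syllable, melisma); and punctuation (none, “,”, “.”).
I compute “3-grams” of all features.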
Here, I show the mutual information (shared entropy) of each pair of features as a proportion of the joint entropy (rounded to two decimal places).
This is calculated independently within each piece/movement, and the minimum–mean–maximum is shown.
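The measure described above can be written as \(I(X;Y) / H(X,Y)\), where \(I(X;Y) = H(X) + H(Y) - H(X,Y)\). The following Python sketch shows one way to compute it on 3-grams of two aligned feature sequences. It is an illustration under stated assumptions, not the analysis script itself: the function names are hypothetical, entropies are estimated from raw n-gram counts, and base-2 logarithms are used (the proportion does not depend on the base).

```python
import math
from collections import Counter
from statistics import mean

def ngrams(values, n=3):
    """All overlapping n-grams of a sequence, as tuples."""
    return [tuple(values[i:i + n]) for i in range(len(values) - n + 1)]

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mi_proportion(x, y, n=3):
    """Mutual information of two aligned feature sequences, computed on
    n-grams and expressed as a proportion of their joint entropy."""
    gx, gy = ngrams(x, n), ngrams(y, n)
    h_x, h_y = entropy(gx), entropy(gy)
    h_xy = entropy(list(zip(gx, gy)))  # entropy of paired n-grams
    if h_xy == 0:
        return 0.0  # no variation at all, so nothing can be shared
    return (h_x + h_y - h_xy) / h_xy

def summarize(per_piece_values):
    """Minimum, mean, and maximum across pieces."""
    return min(per_piece_values), mean(per_piece_values), max(per_piece_values)

if __name__ == "__main__":
    # Hypothetical toy data for one piece: one value per note.
    contour = ["up", "down", "same", "up", "up", "down", "same", "down"]
    position = ["single", "first", "last", "single",
                "first", "middle", "last", "single"]
    print(round(mi_proportion(contour, position), 2))
```

The proportion is 1 when each feature fully determines the other and 0 when the two are independent in the observed data. With short pieces, count-based estimates like this one tend to overstate shared information, which is worth keeping in mind when reading the maxima.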
| | Contour | Melodic interval | Scale degree | Duration | Precedes rest | Position in word | Melisma | Capitalized | Punctuation |
|---|---|---|---|---|---|---|---|---|---|
| Melodic interval | .54–.7–.85 | ||||||||
| Scale degree | .28–.5–.66 | .53–.7–.86 | |||||||
| Duration | .07–.3–.83 | .16–.3–.69 | .13–.3–.79 | ||||||
| Precedes rest | .01–.3–.62 | .01–.2–.51 | .01–.2–.45 | .02–.3–.58 | |||||
| Position in word | .1–.2–.65 | .13–.3–.55 | .12–.3–.48 | .04–.2–.57 | .03–.2–.96 | ||||
| Melisma | 0–0–.03 | 0–0–.02 | 0–0–.02 | 0–0–.04 | 0–0–1 | .01–0–.04 | |||
| Capitalized | .02–.1–.54 | .02–.1–.44 | .02–.1–.37 | .02–.1–.47 | .02–0–.31 | .02–.1–.34 | .01–0–.27 | ||
| Punctuation | 0–0–.22 | 0–0–.23 | 0–0–.23 | 0–0–.45 | 0–0–1 | .01–0–.18 | .01–.5–1 | .01–.1–.27 | |
| Vowel | .16–.3–.48 | .25–.4–.69 | .22–.4–.64 | .11–.2–.62 | .01–.1–.27 | .12–.3–.62 | 0–0–.08 | .02–.1–.62 | 0–0–.18 |
Same analysis, second dataset:
| | Contour | Melodic interval | Scale degree | Duration | Precedes rest | Position in word | Melisma | Capitalized | Punctuation |
|---|---|---|---|---|---|---|---|---|---|
| Melodic interval | .48–.7–.91 | ||||||||
| Scale degree | .18–.4–.77 | .41–.6–.93 | |||||||
| Duration | .07–.2–.92 | .12–.3–.78 | .08–.2–.7 | ||||||
| Precedes rest | .05–.2–.52 | .04–.2–.44 | .02–.1–.4 | .04–.2–1 | |||||
| Position in word | .06–.2–1 | .11–.3–.85 | .05–.2–.77 | .04–.2–.92 | .04–.1–.52 | ||||
| Melisma | 0–0–.15 | 0–0–.13 | 0–0–.12 | .01–0–.22 | .02–.1–.34 | .01–0–.34 | |||
| Capitalized | .01–.1–.31 | .01–.1–.37 | .01–.1–.35 | .02–.1–.4 | .02–.1–.28 | .02–.1–.31 | .02–.2–1 | ||
| Punctuation | .04–.2–.53 | .07–.2–.47 | .04–.1–.39 | .05–.2–.82 | .07–.3–.82 | .05–.1–.4 | .02–.1–.5 | .02–.1–.67 | |
| Vowel | .1–.3–1 | .17–.5–1 | .1–.5–.92 | .09–.3–.92 | .04–.2–.52 | .12–.3–1 | 0–0–.15 | .01–.1–.4 | .06–.2–.47 |
Same analysis, third dataset:
| | Contour | Melodic interval | Scale degree | Duration | Precedes rest | Position in word | Melisma | Capitalized | Punctuation |
|---|---|---|---|---|---|---|---|---|---|
| Melodic interval | .42–.6–.83 | ||||||||
| Scale degree | .16–.4–.73 | .37–.7–.91 | |||||||
| Duration | .05–.2–.47 | .11–.3–.55 | .08–.2–.51 | ||||||
| Precedes rest | .04–.2–.55 | .02–.1–.38 | .02–.1–.45 | .04–.2–.74 | |||||
| Position in word | .05–.2–.44 | .12–.3–.53 | .07–.2–.52 | .04–.1–.32 | .03–.1–.24 | ||||
| Melisma | 0–0–.07 | 0–0–.04 | 0–0–.05 | 0–0–.12 | .01–.1–1 | .01–0–.1 | |||
| Capitalized | .01–.1–.29 | .02–.1–.37 | .01–.1–.28 | .02–.1–.31 | .01–.1–.26 | .03–.1–.39 | .02–.1–.5 | ||
| Punctuation | .03–.1–.27 | .05–.2–.37 | .04–.1–.32 | .04–.1–.33 | .02–.3–.92 | .06–.1–.49 | .02–.1–.19 | .03–.1–.28 | |
| Vowel | .1–.3–.63 | .2–.5–.82 | .09–.4–.82 | .07–.2–.44 | .03–.1–.27 | .2–.4–.78 | .01–0–.06 | .03–.2–.42 | .08–.2–.63 |
Additional tables and analyses are available in the sungdrum repository.
The following can be read as recommendations for creating and curating song datasets and, in parallel, as warnings about potential pitfalls when analyzing lyric data.
More detailed recommendations, encoding schemes, and analysis scripts are posted in my sungdrum repository.
Currently, sungdrum is focused on the English language, but the principles are broadly applicable.
The provenance of lyrics should be considered and encoded in metadata. Distinguish the following (sub)categories:
The possibility of ambiguity or disagreement about lyrics is present in all forms, and must be considered.
It is ideal to decouple independent aspects of lyrics/sung language.
because never 'cause or coz.
The sungdrum repository includes detailed specifications for encoding lyric/linguistic information in Humdrum-syntax data (Huron 1999).
The repository also includes scripts for parsing Humdrum lyric data, including automatic syntactic labeling via the Stanford dependency parser (Chen and Manning 2014).
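To give a sense of what parsing lyric data from a Humdrum score involves, here is a minimal Python sketch. It does not reproduce sungdrum's own encoding scheme or scripts. It assumes a two-spine, tab-delimited excerpt with a `**kern` spine and a `**silbe` spine in which hyphens mark syllable continuations, and the pitches and durations are invented for illustration rather than transcribed from the song. The function names are hypothetical.

```python
# A toy two-spine Humdrum excerpt: notes on the left, syllables on the right.
# The notes are placeholders, not a transcription of the actual melody.
HUMDRUM = "\n".join([
    "**kern\t**silbe",
    "4c\tI",
    "4d\tsee",
    "8e\ttrou-",
    "8d\t-ble",
    "4c\ton",
    "*-\t*-",
])

def position_in_word(syllable: str) -> str:
    """Classify a syllable token by where its hyphens fall."""
    starts, ends = syllable.startswith("-"), syllable.endswith("-")
    if starts and ends:
        return "middle syllable"
    if ends:
        return "first syllable"
    if starts:
        return "last syllable"
    return "single syllable"

def parse(score: str):
    """Return (note, syllable text, position in word) for each data record."""
    rows = []
    for line in score.splitlines():
        # Skip interpretations (*), comments (!), and barlines (=).
        if not line or line[0] in "*!=":
            continue
        note, syllable = line.split("\t")
        if syllable == ".":
            continue  # null token: no new syllable on this note
        rows.append((note, syllable.strip("-"), position_in_word(syllable)))
    return rows

if __name__ == "__main__":
    for row in parse(HUMDRUM):
        print(row)
```

Because position in word is derived directly from the hyphenation, any inconsistency in how syllables are hyphenated carries straight through to this feature, which is one reason consistent syllable encoding matters.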